Experience with a Combined Approach to Attribute-Matching Across Heterogeneous Databases
Authors
Abstract
Determining attribute correspondences is a difficult, time-consuming, knowledge-intensive part of database integration. We report on experiences with tools that identified candidate correspondences, as a step in a large-scale effort to improve communication among Air Force systems. First, we describe a new method that was both simple and surprisingly successful: data dictionary and catalog information were dumped to unformatted text; then off-the-shelf information retrieval software estimated string similarity, generated candidate matches, and provided the interface. The second method used a different set of clues, such as statistics on database populations, to compute separate similarity metrics (using neural network techniques). We report on substantial use of the first tool, and then report some limited initial experiments that examine the two techniques’ accuracy, consistency and complementarity.
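The first method can be illustrated with a minimal sketch. The abstract does not name the retrieval software or its similarity measure, so the TF-IDF weighting and cosine similarity below are assumptions standing in for "off-the-shelf information retrieval software", and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `candidate_matches`) are hypothetical. Each attribute is represented by the free text dumped from its data dictionary entry, and candidate matches are ranked by text similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Map each document key to a sparse TF-IDF vector (dict of term -> weight).

    Assumption: a smoothed IDF, log((1 + n) / (1 + df)) + 1, is used here;
    the original tool's weighting scheme is not described in the abstract.
    """
    counts = {key: Counter(tokenize(text)) for key, text in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    return {
        key: {t: c * idf[t] for t, c in term_counts.items()}
        for key, term_counts in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_matches(source, target, top_k=3):
    """Rank target attributes for each source attribute by text similarity.

    `source` and `target` map attribute names to their dictionary text.
    Returns {source_attr: [(target_attr, score), ...]} with best matches first.
    """
    # Weight terms over both schemas together so the IDF values are shared.
    combined = {("s", name): text for name, text in source.items()}
    combined.update({("t", name): text for name, text in target.items()})
    vectors = tfidf_vectors(combined)

    result = {}
    for s_name in source:
        scored = [
            (cosine(vectors[("s", s_name)], vectors[("t", t_name)]), t_name)
            for t_name in target
        ]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[s_name] = [(t_name, score) for score, t_name in scored[:top_k]]
    return result
```

Called with two dictionaries of attribute descriptions, `candidate_matches` returns a ranked shortlist per source attribute. As in the abstract, the output is a set of candidates for a human integrator to review, not a final mapping.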
Related resources
HeteroClass: A Framework for Effective Classification from Heterogeneous Databases
Classification is an important data mining task, and it has been studied from different perspectives. Recently, multi-relational classification algorithms have been studied because of their many real-world applications. However, current work has generally assumed that all the data needed to build an accurate prediction model resides in a single database. Many practical settings, however, require that we com...
SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks
One step in interoperating among heterogeneous databases is semantic integration: identifying relationships between attributes or classes in different database schemas. SEMantic INTegrator (SEMINT) is a tool based on neural networks to assist in identifying attribute correspondences in heterogeneous databases. SEMINT supports access to a variety of database systems and utilizes both schema infor...
An Integrated Geophysical Approach for Porosity and Facies Determination: A Case Study of Tamag Field of Niger Delta Hydrocarbon Province
Petrophysics, rock physics and multi-attribute analysis have been employed in an integrated approach to delineate porosity variation across Tamag Field of Niger Delta Basin. Gamma and resistivity logs were employed to identify sand bodies and correlated across the field. Petrophysical analysis was undertaken. Rock physics modelling and multi-attribute analysis were carried out. Two hydrocarbo...
Combining Multiple Query Interface Matchers Using Dempster-Shafer Theory of Evidence
Matching query interfaces is a crucial step in data integration across multiple Web databases. The problem is closely related to schema matching that typically exploits different features of schemas. Relying on a particular feature of schemas is not sufficient. We propose an evidential approach to combining multiple matchers using Dempster-Shafer theory of evidence. First, our approach views th...
Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network
Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...